I have been working with NLTK to develop a classification technique. I have two datasets that resemble the movie_reviews corpus (pos/neg). I managed to use the NaiveBayesClassifier; however, I need the algorithm to take an input and classify whether it is pos or neg. I have not found a way to do that. Any ideas on how to change the classifier so that it classifies the input against the datasets (pos/neg)? Any help would be appreciated.
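import nltk
from nltk.classify import NaiveBayesClassifier
from nltk.corpus import CategorizedPlaintextCorpusReader

# NOTE: mr is presumably a CategorizedPlaintextCorpusReader over the
# pos/neg datasets; its construction is not shown in this post.

# (word list, category) pairs for every file in the corpus
documents = [
    (list(mr.words(fileid)), category)
    for category in mr.categories()
    for fileid in mr.fileids(category)
]

def word_feats(words):
    # Bag-of-words featureset: every word present maps to True
    return {word: True for word in words}

negids = mr.fileids('neg')
posids = mr.fileids('pos')

# Labeled featuresets, one per file
negfeats = [(word_feats(mr.words(fileids=[f])), 'neg') for f in negids]
posfeats = [(word_feats(mr.words(fileids=[f])), 'pos') for f in posids]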
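A minimal sketch of how a trained classifier might be applied to a new input, under some assumptions: the tiny in-memory training sentences stand in for the real pos/neg corpus files, the input is tokenized with a plain whitespace split, and classify_text is a hypothetical helper name rather than part of NLTK. The key point is that the input has to pass through the same word_feats feature extractor that was used for training before it is handed to classify.

```python
from nltk.classify import NaiveBayesClassifier


def word_feats(words):
    # Same bag-of-words featureset as used for the training data
    return {word: True for word in words}


# Hypothetical toy data standing in for the real pos/neg corpus files
pos_texts = [
    "great wonderful film loved it",
    "excellent acting and a great story",
]
neg_texts = [
    "terrible boring film hated it",
    "awful acting and a dull story",
]

posfeats = [(word_feats(t.split()), "pos") for t in pos_texts]
negfeats = [(word_feats(t.split()), "neg") for t in neg_texts]

# Train on the labeled featuresets from both categories
classifier = NaiveBayesClassifier.train(posfeats + negfeats)


def classify_text(text):
    # Hypothetical helper: lowercase, split on whitespace (an assumption;
    # a real tokenizer may be preferable), extract features, classify
    feats = word_feats(text.lower().split())
    return classifier.classify(feats)


if __name__ == "__main__":
    sample = "a great and wonderful story"
    print(sample, "->", classify_text(sample))

    # prob_classify exposes the per-label probabilities if a score is needed
    dist = classifier.prob_classify(word_feats(sample.split()))
    for label in dist.samples():
        print(label, round(dist.prob(label), 3))
```

With the real datasets, posfeats and negfeats built from the corpus reader would replace the toy lists, and the input could come from input() or a file instead of a hard-coded string.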